Arabian Gulf
Discovering Governing Equations of Geomagnetic Storm Dynamics with Symbolic Regression
Markidis, Stefano, Ekelund, Jonah, Pennati, Luca, Hu, Andong, Peng, Ivy
Geomagnetic storms are large-scale disturbances of the Earth's magnetosphere driven by solar wind interactions, posing significant risks to space-based and ground-based infrastructure. The Disturbance Storm Time (Dst) index quantifies geomagnetic storm intensity by measuring global magnetic field variations. This study applies symbolic regression to derive data-driven equations describing the temporal evolution of the Dst index. We use historical data from the NASA OMNIweb database, including solar wind density, bulk velocity, convective electric field, dynamic pressure, and magnetic pressure. The PySR framework, an evolutionary algorithm-based symbolic regression library, is used to identify mathematical expressions linking dDst/dt to key solar wind. The resulting models include a hierarchy of complexity levels and enable a comparison with well-established empirical models such as the Burton-McPherron-Russell and O'Brien-McPherron models. The best-performing symbolic regression models demonstrate superior accuracy in most cases, particularly during moderate geomagnetic storms, while maintaining physical interpretability. Performance evaluation on historical storm events includes the 2003 Halloween Storm, the 2015 St. Patrick's Day Storm, and a 2017 moderate storm. The results provide interpretable, closed-form expressions that capture nonlinear dependencies and thresholding effects in Dst evolution.
- North America > United States (0.35)
- Europe > Sweden > Stockholm > Stockholm (0.05)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)
- Energy (0.93)
- Government > Space Agency (0.35)
- Government > Regional Government > North America Government > United States Government (0.35)
Non-IID data in Federated Learning: A Survey with Taxonomy, Metrics, Methods, Frameworks and Future Directions
G., Daniel M. Jimenez, Solans, David, Heikkila, Mikko, Vitaletti, Andrea, Kourtellis, Nicolas, Anagnostopoulos, Aris, Chatzigiannakis, Ioannis
Recent advances in machine learning have highlighted Federated Learning (FL) as a promising approach that enables multiple distributed users (so-called clients) to collectively train ML models without sharing their private data. While this privacy-preserving method shows potential, it struggles when data across clients is not independent and identically distributed (non-IID) data. The latter remains an unsolved challenge that can result in poorer model performance and slower training times. Despite the significance of non-IID data in FL, there is a lack of consensus among researchers about its classification and quantification. This technical survey aims to fill that gap by providing a detailed taxonomy for non-IID data, partition protocols, and metrics to quantify data heterogeneity. Additionally, we describe popular solutions to address non-IID data and standardized frameworks employed in FL with heterogeneous data. Based on our state-of-the-art survey, we present key lessons learned and suggest promising future research directions.
- North America > United States (0.04)
- Europe > Spain (0.04)
- Europe > Italy (0.04)
- (8 more...)
- Overview (1.00)
- Research Report > Promising Solution (0.87)
- Information Technology > Security & Privacy (1.00)
- Education (1.00)
- Health & Medicine > Diagnostic Medicine (0.92)
A deep learning framework for jointly extracting spectra and source-count distributions in astronomy
Wolf, Florian, List, Florian, Rodd, Nicholas L., Hahn, Oliver
Astronomical observations typically provide three-dimensional maps, encoding the distribution of the observed flux in (1) the two angles of the celestial sphere and (2) energy/frequency. An important task regarding such maps is to statistically characterize populations of point sources too dim to be individually detected. As the properties of a single dim source will be poorly constrained, instead one commonly studies the population as a whole, inferring a source-count distribution (SCD) that describes the number density of sources as a function of their brightness. Statistical and machine learning methods for recovering SCDs exist; however, they typically entirely neglect spectral information associated with the energy distribution of the flux. We present a deep learning framework able to jointly reconstruct the spectra of different emission components and the SCD of point-source populations. In a proof-of-concept example, we show that our method accurately extracts even complex-shaped spectra and SCDs from simulated maps.
- North America > United States (0.14)
- Europe > Austria > Vienna (0.05)
- North America > Canada > Alberta > Census Division No. 13 > Woodlands County (0.04)
- Asia > Middle East > Qatar > Arabian Gulf (0.04)
Conditionally Calibrated Predictive Distributions by Probability-Probability Map: Application to Galaxy Redshift Estimation and Probabilistic Forecasting
Dey, Biprateep, Zhao, David, Newman, Jeffrey A., Andrews, Brett H., Izbicki, Rafael, Lee, Ann B.
Uncertainty quantification is crucial for assessing the predictive ability of AI algorithms. Much research has been devoted to describing the predictive distribution (PD) $F(y|\mathbf{x})$ of a target variable $y \in \mathbb{R}$ given complex input features $\mathbf{x} \in \mathcal{X}$. However, off-the-shelf PDs (from, e.g., normalizing flows and Bayesian neural networks) often lack conditional calibration with the probability of occurrence of an event given input $\mathbf{x}$ being significantly different from the predicted probability. Current calibration methods do not fully assess and enforce conditionally calibrated PDs. Here we propose \texttt{Cal-PIT}, a method that addresses both PD diagnostics and recalibration by learning a single probability-probability map from calibration data. The key idea is to regress probability integral transform scores against $\mathbf{x}$. The estimated regression provides interpretable diagnostics of conditional coverage across the feature space. The same regression function morphs the misspecified PD to a re-calibrated PD for all $\mathbf{x}$. We benchmark our corrected prediction bands (a by-product of corrected PDs) against oracle bands and state-of-the-art predictive inference algorithms for synthetic data. We also provide results for two applications: (i) probabilistic nowcasting given sequences of satellite images, and (ii) conditional density estimation of galaxy distances given imaging data (so-called photometric redshift estimation). Our code is available as a Python package https://github.com/lee-group-cmu/Cal-PIT .
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
- (8 more...)
- Energy (1.00)
- Government > Regional Government > North America Government > United States Government (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Tropical Cyclone Track Forecasting using Fused Deep Learning from Aligned Reanalysis Data
Giffard-Roisin, Sophie, Yang, Mo, Charpiat, Guillaume, Kumler-Bonfanti, Christina, Kégl, Balázs, Monteleoni, Claire
The forecast of tropical cyclone trajectories is crucial for the protection of people and property. Although forecast dynamical models can provide high-precision short-term forecasts, they are computationally demanding, and current statistical forecasting models have much room for improvement given that the database of past hurricanes is constantly growing. Machine learning methods, that can capture non-linearities and complex relations, have only been scarcely tested for this application. We propose a neural network model fusing past trajectory data and reanalysis atmospheric images (wind and pressure 3D fields). We use a moving frame of reference that follows the storm center for the 24h tracking forecast. The network is trained to estimate the longitude and latitude displacement of tropical cyclones and depressions from a large database from both hemispheres (more than 3000 storms since 1979, sampled at a 6 hour frequency). The advantage of the fused network is demonstrated and a comparison with current forecast models shows that deep learning methods could provide a valuable and complementary prediction. Moreover, our method can give a forecast for a new storm in a few seconds, which is an important asset for real-time forecasts compared to traditional forecasts.
- North America > United States > Colorado > Boulder County > Boulder (0.28)
- Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- (3 more...)